A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures
نویسندگان
چکیده
A spectral conversion method using multiple Gaussian Mixture Models (GMMs) based on the Bayesian framework is proposed. A typical spectral conversion framework is based on a GMM. However, in this conventional method, a GMM-appropriate number of mixtures is dependent on the amount of training data, and thus the number of mixtures should be determined beforehand. In the proposed method, the variational Bayesian approach is applied to GMM-based voice conversion, and multiple GMMs are integrated as a single statistical model. Appropriate model structures are stochastically selected for each frame based on the Bayesian frame work.
منابع مشابه
A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures
This paper proposes a speaker recognition technique using multiple model structures based on the Bayesian approach. In recent speaker recognition, many sophisticated statistical models have been proposed, e.g., Joint Factor Analysis and i-Vector based method. However, since most of them are based on Gaussian Mixture Models (GMMs), therefore improving estimation accuracy of generative models, i....
متن کاملUsing Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article aims to examine methods of optimizing GMM-based voice conversion systems performance in which GMM method is introduced as the basic method for improvement of voice conversion systems performance. In the current methods, due to using a single conversion function to convert all speech units and subsequent spectral smoothing arising from statistical averaging, we will observe quality ...
متن کاملOne-to-Many Voice Conversion Based on Tensor Representation of Speaker Space
This paper describes a novel approach to flexible control of speaker characteristics using tensor representation of speaker space. In voice conversion studies, realization of conversion from/to an arbitrary speaker’s voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In the EVC, similarly t...
متن کاملSpeech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
متن کاملA Voice Conversion Method Combining Segmental GMM Mapping with Target Frame Selection
In this paper, a voice conversion approach that combines two distinct ideas is proposed to improve the converted-voice quality. The first idea is to map spectral features, e.g. discrete cepstrum coefficients (DCC), with segmental Gaussian mixture models (GMMs). That is, a single GMM of a large number of mixture components is replaced here with several voice-content specific GMMs each consisting...
متن کامل